Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libbeat/processors/cache: new processor #36619

Merged
merged 7 commits into from
Sep 25, 2023
Merged

Conversation

efd6
Copy link
Contributor

@efd6 efd6 commented Sep 18, 2023

Proposed commit message

This is the initial infrastructure for a caching metadata processor to be added. It currently only supports in-memory caching, though both planned cache types are configurable with the file cache being mocked by an in-memory cache.

Additional config options are added relative to the RFC, but these are intentionally not documented at this stage.

This is part of the work required for elastic/integrations#2816.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Author's Checklist

  • [ ]

How to test this PR locally

Related issues

Use cases

Screenshots

Logs

@efd6 efd6 self-assigned this Sep 18, 2023
@botelastic botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels Sep 18, 2023
@efd6
Copy link
Contributor Author

efd6 commented Sep 19, 2023

run elasticsearch-ci/docs

@efd6 efd6 force-pushed the i2816-cache_processor branch 2 times, most recently from 8fd8793 to 42a6910 Compare September 19, 2023 01:09
@elasticmachine
Copy link
Collaborator

elasticmachine commented Sep 19, 2023

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2023-09-25T10:04:47.935+0000

  • Duration: 72 min 14 sec

Test stats 🧪

Test Results
Failed 0
Passed 28332
Skipped 2013
Total 30345

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@efd6
Copy link
Contributor Author

efd6 commented Sep 19, 2023

/package

@efd6
Copy link
Contributor Author

efd6 commented Sep 19, 2023

/test

@efd6 efd6 marked this pull request as ready for review September 19, 2023 05:11
@efd6 efd6 requested review from a team as code owners September 19, 2023 05:11
@elasticmachine
Copy link
Collaborator

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

@efd6 efd6 force-pushed the i2816-cache_processor branch 2 times, most recently from 01583da to 1aca826 Compare September 19, 2023 05:37
@pierrehilbert pierrehilbert added the Team:Elastic-Agent Label for the Agent team label Sep 19, 2023
This is the initial infrastructure for a caching metadata processor to be added.
It currently only supports in-memory caching, though both planned cache types
are configurable with the file cache being mocked by an in-memory cache.

Additional config options are added relative to the RFC, but these are
intentionally not documented at this stage.
@efd6
Copy link
Contributor Author

efd6 commented Sep 19, 2023

golangci-lint appears to be ignoring the config on windows.

Copy link
Member

@andrewkroh andrewkroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. Left some minor comments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this file is not required (especially since the processor doesn't have any of its own fields). Are there any issues if this is removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to be ok locally. Trying a full build here.

return s, func() {
// TODO: Consider making this reference counted.
// Currently, what we have is essentially an
// ownership model, where the put operation is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where the put operation is the owner

Or is it the first processor instantiated that is the owner?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct. Will fix.

// TODO: Consider making this reference counted.
// Currently, what we have is essentially an
// ownership model, where the put operation is
// owner. This could conceivably be problematic
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the flexibility afforded to users in their beat configurations and the fact that we allow for dynamic loading/reloading of configs in some areas, I would feel more confident with a reference count.

Comment on lines 263 to 265
// ... and write it into the cloned event.
result = event.Clone()
if _, err = result.PutValue(dst, meta); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to clone? That is an expensive operation. I assume we want to clone so that if PutValue fails we don't leave the event in some partially modified state. Perhaps there are some pre-checks we can perform on the event to ensure that it never fails in PutValue?

Copy link
Contributor Author

@efd6 efd6 Sep 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was the reason for it. I managed to significantly lower it to here, the point where the only failure possible is in the actual PutValue operation, and though this is not documented, the implementation of that fails before a mutation, so this should be fine. I'll change it and add a defensive test for the case.

@efd6 efd6 force-pushed the i2816-cache_processor branch 2 times, most recently from c25ce92 to fffd521 Compare September 20, 2023 06:48
Copy link
Member

@andrewkroh andrewkroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Can you please add an entry to the codeowners file for this package.

// backing data store is incomplete, and has no put operation defined, the TTL
// will be invalid, but will never be accessed since all time operations outside
// put refer to absolute times. setPutOptions also increases the reference count
// for the memStore for all operation types.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setPutOptions also increases the reference count

That extra behavior was a little surprising. Consider an explicit "retain()" method?

Copy link
Contributor Author

@efd6 efd6 Sep 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was having similar thoughts. This is a consequence of the change to ref counting. I'm reluctant to make this two methods, but I think we get equivalent reduction in confusion by changing the name to add and removing the comment at the call site.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole ref count code is now significantly clearer.

Copy link
Member

@rdner rdner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except some typos and linter issues.

libbeat/processors/cache/mem_store.go Outdated Show resolved Hide resolved
libbeat/processors/cache/mem_store.go Outdated Show resolved Hide resolved
},
1: {
doTo: func(s *memStore) error {
s.Put("one", 1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
s.Put("one", 1)
_ = s.Put("one", 1)

Copy link
Contributor Author

@efd6 efd6 Sep 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be flagged.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@efd6 this rule is for the github.com/elastic/elastic-agent-libs/mapstr.M type, right? And you have memStore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, sorry, you are correct. I've added a nolint comment.

1: {
doTo: func(s *memStore) error {
s.Put("one", 1)
s.Put("two", 2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
s.Put("two", 2)
_ = s.Put("two", 2)

doTo: func(s *memStore) error {
s.Put("one", 1)
s.Put("two", 2)
s.Put("three", 3)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
s.Put("three", 3)
_ = s.Put("three", 3)

@efd6 efd6 merged commit f111cbc into elastic:main Sep 25, 2023
6 of 8 checks passed
Scholar-Li pushed a commit to Scholar-Li/beats that referenced this pull request Feb 5, 2024
This is the initial infrastructure for a caching metadata processor to be added.
It currently only supports in-memory caching, though both planned cache types
are configurable with the file cache being mocked by an in-memory cache.

Additional config options are added relative to the RFC, but these are
intentionally not documented at this stage.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
8.11-candidate backport-skip Skip notification from the automated backport with mergify enhancement libbeat Team:Elastic-Agent Label for the Agent team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants